The Infinite Regionalized Policy Representation

Authors

  • Miao Liu
  • Xuejun Liao
  • Lawrence Carin
Abstract

We introduce the infinite regionalized policy representation (iRPR), a nonparametric policy for reinforcement learning in partially observable Markov decision processes (POMDPs). The iRPR assumes an unbounded set of decision states a priori and infers the number of states needed to represent the policy from the experiences. We propose algorithms for learning the number of decision states while maintaining a proper balance between exploration and exploitation. Convergence analysis is provided, along with performance evaluations on benchmark problems.
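The abstract does not say how the prior over an unbounded set of decision states is constructed. One common device in Bayesian nonparametrics is a stick-breaking prior, and the sketch below illustrates that idea only; it is an assumption for illustration, not the authors' algorithm. The names `stick_breaking_weights`, `effective_states`, `alpha`, `truncation`, and `threshold` are all hypothetical.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw prior weights over decision states by truncated stick-breaking.

    Assumption: a stick-breaking prior stands in for the "unbounded set of
    decision states"; the truncation level is a practical cap, not a model size.
    """
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # the last state absorbs the leftover mass
    return weights


def effective_states(weights, threshold=0.01):
    """Count states whose weight exceeds a (hypothetical) significance threshold."""
    return sum(1 for w in weights if w > threshold)


rng = random.Random(0)
weights = stick_breaking_weights(alpha=2.0, truncation=50, rng=rng)
print(len(weights), round(sum(weights), 6), effective_states(weights))
```

Under such a prior, most of the mass typically falls on a small number of states, so the number of states that matter is determined by the weights (and, after learning, by the data) rather than fixed in advance. A smaller `alpha` tends to concentrate the mass on fewer states.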

Similar resources

Infinite product representation of solution of indefinite Sturm-Liouville problem

In this paper, we investigate the infinite product representation of the solution of a Sturm-Liouville equation with an indefinite weight function which has two zeros and/or singularities in a finite interval. First, by using the asymptotic estimates provided in [W. Eberhard, G. Freiling, K. Wilcken-Stoeber, Indefinite eigenvalue problems with several singular points and turning points, Math. N...

Study of Solute Dispersion with Source/Sink Impact in Semi-Infinite Porous Medium

Mathematical models for pollutant transport in semi-infinite aquifers are based on the advection-dispersion equation (ADE) and its variants. This study employs the ADE incorporating time-dependent dispersion and velocity and space-time dependent source and sink, expressed by one function. The dispersion theory allows mechanical dispersion to be directly proportional to seepage velocity. Initial...

Regionalized Policy Representation for Reinforcement Learning in POMDPs

Many decision-making problems can be formulated in the framework of a partially observable Markov decision process (POMDP) [5]. The optimality of decisions relies on the accuracy of the POMDP model as well as the policy found for the model. In many applications the model is unknown and learned empirically based on experience, and building a model is just as difficult as finding the associated p...

Concepts and Application of Three Dimensional Infinite Elements to Soil Structure-Interaction Problems

This study is concerned with the formulation of three dimensional mapped infinite elements with 1/r and 1/ decay types. These infinite elements are coupled with conventional finite elements, and their application to some problems of soil-structure interaction is discussed. The efficiency of the coupled finite-infinite element formulation with respect to computational effort, data preparation a...

Publication date: 2011